Skip to content

runtime: backfill missing block_id before snapshot packaging#11631

Closed
KirillLykov wants to merge 1 commit into
anza-xyz:masterfrom
KirillLykov:klykov/fix-validator-bind-id-panic
Closed

runtime: backfill missing block_id before snapshot packaging#11631
KirillLykov wants to merge 1 commit into
anza-xyz:masterfrom
KirillLykov:klykov/fix-validator-bind-id-panic

Conversation

@KirillLykov
Copy link
Copy Markdown

Problem

This fixes a validator crash in snapshot handling when a rooted bank has no block_id. This was triggered a fresh local cluster one-node validator running master.
It seems that in some flows rooted back may still have block_id == None and in handle_snapshot_request we don't ensure that is populated.

Summary of Changes

This PR ensures block_id != None and adds a test.

This fixes a validator crash in snapshot handling when a rooted bank has no block_id.
This was triggered a fresh local cluster one-node validator running
master.
It seems that in some flows rooted back may still have `block_id == None` and in `handle_snapshot_request` we don't ensure that is populated.
This PR ensures this and adds a test.
Comment on lines +275 to +279
// Backfill block id for legacy/replayed banks before snapshot packaging. This prevents
// snapshot panics when a rooted bank is missing block_id.
if snapshot_root_bank.block_id().is_none() {
Bank::calculate_and_set_block_id_for_dcou(&snapshot_root_bank);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AccountsBackgroundService should not be setting the block id. If block id is not set by the time we get here, it means there's a bug somewhere else.

@brooksprumo
Copy link
Copy Markdown

PR #11613 is supposed to fix this issue. When you saw the crash, were you running with a build older by chance?

@KirillLykov
Copy link
Copy Markdown
Author

PR #11613 is supposed to fix this issue. When you saw the crash, were you running with a build older by chance?

not sure if I rebased or not. So will close for now this PR and if observe this failure again, rethink.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants